Reduction of the dimension of a document space using the fuzzified output of a Kohonen network

Authors

  • Vicente P. Guerrero Bote
  • Félix de Moya Anegón
Abstract

The vectors used in information retrieval (IR), whether they represent documents or terms, are high dimensional, and their dimensionality grows as one approaches real-world problems. The algorithms used to manipulate them, however, demand rapidly increasing computational resources as that dimensionality grows. We used the Kohonen algorithm and a fuzzification module to perform a fuzzy clustering of the terms. The degrees of membership obtained were used to represent the terms and, by extension, the documents, yielding vectors with fewer components that still carry meaning. To test the results, we used a topological classification of sets of transformed and untransformed vectors to check that the same structure underlies both.
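The pipeline outlined in the abstract can be illustrated with a minimal sketch. The abstract does not specify the fuzzification module, so the fuzzy c-means style membership formula used here is an assumption, as are the map size, the training schedule, the toy data, and the function names `train_som`, `fuzzify`, and `reduce_documents`, all of which are hypothetical and not taken from the paper.

```python
import numpy as np

def train_som(data, grid=(2, 2), epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small Kohonen map; returns prototypes of shape (units, dim)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    # Grid coordinates of each unit, used by the neighbourhood function.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise prototypes from randomly chosen input vectors.
    weights = data[rng.integers(0, len(data), rows * cols)].astype(float)
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total
            lr = lr0 * (1.0 - frac)                 # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3    # shrinking neighbourhood radius
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))  # best-matching unit
            grid_d2 = np.sum((coords - coords[bmu]) ** 2, axis=1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def fuzzify(vectors, weights, m=2.0, eps=1e-12):
    """Assumed fuzzification: FCM-style memberships from distances to prototypes.

    Each row holds one vector's degrees of membership in every unit and sums to 1.
    """
    d = np.linalg.norm(vectors[:, None, :] - weights[None, :, :], axis=2) + eps
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def reduce_documents(doc_term, term_memberships):
    """Represent each document as the weight-normalised sum of its terms' memberships."""
    doc_term = np.asarray(doc_term, dtype=float)
    totals = doc_term.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # leave all-zero documents as zero vectors
    return (doc_term @ term_memberships) / totals

# Toy data (made up for illustration): 6 documents described by 20 terms.
rng = np.random.default_rng(1)
doc_term = rng.random((6, 20))
term_vectors = doc_term.T                      # each term described by its document weights
prototypes = train_som(term_vectors, grid=(2, 2))
memberships = fuzzify(term_vectors, prototypes)    # shape (20 terms, 4 units)
reduced = reduce_documents(doc_term, memberships)  # shape (6 documents, 4 components)
```

In this sketch each document shrinks from 20 term components to 4 membership components, one per map unit, so each component can still be read as the document's affinity to a cluster of terms.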


Similar resources

Enhancing Efficiency of Neural Network Model in Prediction of Firms Financial Crisis Using Input Space Dimension Reduction Techniques

The main focus of this study is data pre-processing, specifically reducing the number of inputs (input space size reduction), the purpose of which is a justified generalization of the data set in fewer dimensions without losing the most significant data. When the input space is large, the most important input variables can be identified and insignificant variables eliminated, or a variable ...


An Unsupervised Learning Method for an Attacker Agent in Robot Soccer Competitions Based on the Kohonen Neural Network

The RoboCup competition, as a great test-bed, has become a popular domain worldwide in recent years. The main objective of such competitions is to deal with the complex behavior of systems which consist of multiple autonomous agents. The rich experience of human soccer players can be used as a valuable reference for a robot soccer player. However, because of the differences between real and simulated soc...


Learning Document Image Features With SqueezeNet Convolutional Neural Network

The classification of various document images is considered an important step towards building a modern digital library or office automation system. Convolutional Neural Network (CNN) classifiers trained with backpropagation are considered to be the current state of the art model for this task. However, there are two major drawbacks for these classifiers: the huge computational power demand for...


Controlling Both the DC Boost and AC Output Voltage of Z-Source Inverter Using Neural Network Controller with Minimization of Voltage Stress Across Devices

This paper presents a method to control both the dc boost and the ac output voltage of Z-source inverter using neural network controllers. The capacitor voltage of Z-source network has been controlled linearly in order to improve the transient response of the dc boost control of the Z-source inverter. The peak value of the line to line ac output voltage is used to control and keep the ac output...


A survey on Automatic Text Summarization

Text summarization endeavors to produce a summary version of a text while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to decipher such a massive amount of data in order to extract the useful information is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...



Journal:
  • JASIST

Volume 52  Issue 

Pages  -

Publication date: 2001